Logic-Based Models for the Analysis of Cell Signaling Networks
Abstract
Computational models are increasingly used to analyze the operation of complex biochemical networks, including those involved in cell signaling. Here we review recent advances in applying logic-based modeling to mammalian cell biology. Logic-based models represent biomolecular networks in a simple and intuitive manner without describing the detailed biochemistry of each interaction. A brief description of several logic-based modeling methods is followed by six case studies that demonstrate biological questions recently addressed using logic-based models. The case studies also point to potential advances in model formalisms and training procedures that promise to enhance the utility of logic-based methods for studying the relationship between environmental inputs and phenotypic or signaling state outputs of complex signaling networks.

With accelerating pace, molecular biology and biochemistry are identifying complex patterns of interactions among intracellular and extracellular biomolecules. With respect to cell signaling in eukaryotes, the focus of this review, complex multicomponent networks involving many shared components govern how a cell will respond to diverse environmental cues. Powerful experimental approaches now exist for identifying components of these networks and for determining their biochemical activities, but understanding the networks as an integrated whole is difficult using intuition alone. Thus, mathematical and computational modeling is increasingly playing a role in data interpretation and in attempts to extract general biological understanding (1, 2). Depending on the network studied, the data available, and the questions posed, a diverse spectrum of modeling approaches exists, ranging from the highly abstract to the highly specified (3, 4). The goal of this review is to discuss logic-based modeling, an approach lying midway between the complexity and precision of differential equations on one hand and data-driven regression approaches on the other.
Within the spectrum of modeling methods currently being applied to cellular biochemistry, models involving differential equations bear the closest relationship to underlying biochemical rate laws. Sets of coupled ordinary differential equations (ODEs) can effectively represent chemical reactions when the number of molecules is large and mass action approximations are appropriate. Partial differential equations (PDEs) add the ability to represent spatial gradients (5), and stochastic methods make it possible to analyze systems in which the number of molecules is small (6). Networks of differential equations can model the temporal and spatial dynamics of biochemical processes in considerable detail, making it possible to study chemical mechanism and predict network dynamics under various conditions. However, the topology of ODE- and PDE-based models (that is, patterns of interaction among the species) must be specified in advance, and model output is strongly dependent on the values of free parameters (typically initial protein concentrations and rate constants). Estimating these parameters is a computationally intensive task requiring substantial data. As networks get larger, ODE modeling becomes more and more challenging, and models that attempt to capture real biological data are currently limited to a few dozen components.

At the other extreme, a very active field has emerged for computing graphical representations of biological networks through literature analysis or identification of correlations in high-throughput data. In these graphs, termed protein interaction networks (PINs or interactomes) or protein signaling networks (PSNs), genes and proteins are represented by nodes and potential interactions by edges (links). The edges can be directional or not and signed (inhibitory/activating) or not, and they typically represent a wide range of interaction modes, from direct physical binding to correlated gene expression (7) or integrated database entries (8).
Graphs are an attractive way to summarize diverse relationships among large numbers of biomolecules across multiple organisms, but they are not executable per se and cannot be used to compute input-output relationships. Moreover, network graphs rarely take into account dynamic changes in signaling activities, cell type-specific biochemistry, or context-dependent variations (9).

Here, we review logic-based models, which represent a compromise between highly specified differential equation models and protein interaction graphs. Using logic-based methods, it is possible to model interactions among large numbers of protein species and perform model training, model validation, and model-based prediction. The first application of logic-based modeling to biological pathways is credited to Kauffman, who used discrete logic to model the biological process of gene regulation (10). Subsequent work focused on delineating theoretical properties of logic-based models of gene regulation (11, 12). Huang and Ingber were among the first to apply logic-based modeling to cell signaling networks, demonstrating that specific cell phenotypes might correspond to dynamic steady states of a logic-based model of intracellular signaling species (13). This example of linking environmental inputs to phenotypic outputs via a logic-based model of a biochemical signaling network has sparked considerable interest in the possibility of harnessing logic-based models to understand the relationship between biochemical signaling networks and cell state, as reflected in a large number of studies over the past few years (13-33).

This work was funded by National Institutes of Health Grants P50GM68762 and U54-CA112967 and the Department of Defense Institute of Collaborative Biotechnologies. *To whom correspondence should be addressed. Phone: (617) 2521629. Fax: (617) 258-0204. E-mail: [email protected]. Current Topic/Perspective. Biochemistry, Vol. 49, No. 15, 2010, 3217.

This review is divided into two sections.
In the first, we describe the fundamentals of logic-based modeling; in the second, we discuss six applications of logic-based modeling to eukaryotic biology. We focus on logic-based models of biochemical signaling networks and refer the reader to the literature for a more in-depth explanation of theoretical considerations (34), applications of logic-based models to gene regulatory networks (11), and models of intercellular communication (35, 36).

REPRESENTING BIOCHEMICAL NETWORKS WITH LOGIC-BASED MODELS

What Is a Logic-Based Model? Consider the graphical representation of a signaling network common to protein interaction networks (Figure 1a): the nodes in the graph represent proteins, and the edges represent interactions. Such a graph depicts nodes that interact physically or have correlated expression or genetic profiles (depending on the underlying data source) but does not allow us to explicitly compute the state of activity of individual nodes given different inputs or initial network states. Performing such a calculation requires information about how each node reacts to the activities of its input nodes. In logic-based models, these dependencies are specified by “gates” (Figure 1b) which, in Boolean logic, are defined by “truth tables” that list output states for all possible combinations of input states (Figure 1c). Figure 1d shows the truth tables of the OR, AND, and NOT Boolean logic gates as well as a small network in which gates are assembled to create the AND-NOT logic gate.

To illustrate how logic-based modeling can be applied to a biological network, consider a hypothetical representation of epidermal growth factor receptor (EGFR) and several downstream proteins (Figure 1e). This toy network is too simple to be realistic but demonstrates several issues of importance when building a logic-based model. Either epidermal growth factor (EGF) or heregulin (HRG) can bind to and activate EGFR (Figure 1d,e).
EGFR then stimulates the Raf/ERK and PI3K/AKT pathways (the multitude of known biochemical interactions in this case is modeled as a single “activating” edge). ERK activity inhibits EGFR-dependent PI3K activation, whereas AKT positively regulates the Raf/ERK pathway (Figure 1d,e). With this information, it is possible to compute the response of the unperturbed network to a given input as well as responses resulting from inhibition of a node (by a drug, for example). However, under all simulated conditions [EGF or HRG alone or in combination (Figure 1f)], the network response is the same. This is to be expected because binary logic cannot encode the differential sensitivities of EGFR to EGF and HRG, a point to which we return below.

Modeling Nondiscrete Processes Using Logic-Based Approaches. The assumption in Boolean logic that all species are either on or off (state 1 or 0, respectively) is clearly an unrealistic way to represent binding curves or catalytic reactions. Fortunately, logic-based modeling provides several approaches for modeling intermediate states of activity (Figure 2a). Multistate discrete models specify additional levels between 0 and 1, whereas fuzzy logic allows for continuous node states. In fuzzy logic, which has found wide utility in industrial control systems, a set of user-defined functions transforms discrete logic statements into relationships between continuous inputs and output levels. Other methods of describing discrete or Boolean logic models as continuous or mixed discrete-continuous have also been implemented successfully [Figure 2a (dashed lines)] (28, 37, 38).

How is a prototypical biological interaction approximated using discrete and nondiscrete logic formalisms?
In Figure 2b, a sigmoidal relationship between input and output level [e.g., a protein kinase acting on a substrate (black solid line)] is approximated by binary (red solid line), ternary (green dashed line), and quaternary (blue dashed-dotted line) discrete logic functions. Fuzzy logic and mixed discrete-continuous logic can closely approximate the real response (orange dashed line). It is important to note, however, that the increased degree of realism of multistate or fuzzy logic modeling comes at the cost of increased complexity, typically in the form of a threshold or transfer function having free parameters that must be estimated.

Figure 1g provides an example of how multistate discrete logic can be used to represent the differing states of activation of EGFR when exposed to EGF and HRG stimulation, where an additional activation level of “two” indicates that EGFR is more sensitive to EGF than HRG. In the model, addition of HRG alone causes AKT and ERK activity levels to oscillate (Figure 1h, right panel). These oscillations are caused by the negative feedback between ERK and PI3K. However, when either EGF alone or both EGF and HRG are present (Figure 1h, left panel), EGFR is in activation state two and the negative feedback inhibiting PI3K is absent. Thus, oscillations are not observed.

Treatment of Time in a Logic-Based Model. The presence of oscillations in this and other logic-based networks complicates analysis, and the actual form that the oscillations take depends on the treatment of time during the simulation. Logic-based models represent time with varying degrees of detail. We present this concept graphically in Figure 2c, where each modeling formalism is classified according to the detail in its representation of species’ state and time. Table 1 presents a comparison of the approaches in tabular form. The activity of each species in discrete logic-based network simulations is determined by its input node states at some previous time step.
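The multistate behavior described above can be illustrated with a minimal Python sketch. The rule set is an assumption made for illustration and is not the set of truth tables in Figure 1g: EGFR takes level 2 with EGF and level 1 with HRG alone, ERK feedback blocks PI3K only at EGFR level 1, and Raf is assumed to require both EGFR and AKT. The node names and the `step` and `simulate` helpers are hypothetical. Every node is computed from the states at the previous time step.

```python
NODES = ("EGFR", "PI3K", "AKT", "RAF", "ERK")

def step(state, egf, hrg):
    """One synchronous update: every rule reads only the previous state."""
    return {
        # Assumed multistate rule: EGF drives EGFR to level 2, HRG alone to level 1.
        "EGFR": 2 if egf else (1 if hrg else 0),
        # Assumed rule: ERK feedback inhibits PI3K only at EGFR level 1.
        "PI3K": int(state["EGFR"] == 2
                    or (state["EGFR"] == 1 and state["ERK"] == 0)),
        "AKT": state["PI3K"],
        # Assumed rule: Raf needs active EGFR and AKT (an AND gate).
        "RAF": int(state["EGFR"] > 0 and state["AKT"] == 1),
        "ERK": state["RAF"],
    }

def simulate(egf, hrg, max_steps=50):
    """Run from the all-off state until a state repeats.

    Returns (trajectory, period); period is 1 for a steady state,
    greater than 1 for a cyclic attractor, None if no repeat was seen.
    """
    state = dict.fromkeys(NODES, 0)
    seen = {}
    trajectory = []
    for t in range(max_steps):
        key = tuple(state[n] for n in NODES)
        if key in seen:
            return trajectory, t - seen[key]
        seen[key] = t
        trajectory.append(key)
        state = step(state, egf, hrg)
    return trajectory, None

if __name__ == "__main__":
    for egf, hrg in [(True, False), (False, True), (True, True)]:
        _, period = simulate(egf, hrg)
        print(f"EGF={egf} HRG={hrg} attractor period={period}")
```

Under these assumed rules, HRG alone produces a cyclic attractor, whereas EGF, alone or together with HRG, settles into a steady state. This qualitatively mirrors the behavior described for Figure 1h, although the published rules may differ.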
Abbreviations: EGF, epidermal growth factor; EGFR, epidermal growth factor receptor; ErbB, epidermal growth factor receptor; ERK, extracellular regulated kinase; GAP, GTPase activating protein; HRG, heregulin; IGF1, insulin-like growth factor 1; IKK, IκB kinase; IL1R, interleukin 1R; IL6, interleukin 6; LPS, lipopolysaccharide; MEK, mitogen-activated protein kinase kinase; MK2, mitogen-activated protein kinase-activated protein kinase 2; PBMC, peripheral blood mononuclear cells; PI3K, phosphoinositide 3-kinase; T-LGL, T cell large granular lymphocyte(s); TGFR, transforming growth factor R; Th Cell, T helper cell; TNFR, tumor necrosis factor R. Morris et al. Biochemistry, Vol. 49, No. 15, 2010, 3218.

The order in which node states are updated results in an implicit treatment of time scales. Two primary node-updating schemes exist: synchronous and asynchronous (12, 39, 40). Synchronous updating updates every node at each time step according to the states of its input nodes at the previous time step, whereas asynchronous updating updates node states in random order. In practical terms, asynchronous updating involves updating an output node on the basis of some of its input nodes at the current time step and others at a previous time step. Variants of both synchronous and asynchronous updating exist. Time delays can be specified with synchronous updating, allowing for a more refined description of dynamics. A variant of asynchronous updating, mixed asynchronous updating, allows some nodes to be updated before others, making it possible to separate time scales of fast (e.g., binding and phosphorylation) and slow (e.g., degradation and transcription) reactions in a manner similar to that of time delays (41).
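The difference between the two updating schemes can be made concrete with a short Python sketch. The three Boolean rules below form a hypothetical negative feedback loop chosen only for illustration, and the function names `sync_step` and `async_step` are assumptions rather than part of any published tool. The synchronous step evaluates every rule against the previous state, whereas the asynchronous step updates nodes one at a time in a given (by default random) order, so that later nodes see the values just written by earlier ones.

```python
import random

# Hypothetical Boolean rules for a small negative feedback loop.
# EGFR is treated as a fixed input, so it has no rule of its own.
RULES = {
    "PI3K": lambda s: s["EGFR"] and not s["ERK"],
    "AKT": lambda s: s["PI3K"],
    "ERK": lambda s: s["AKT"],
}

def sync_step(state):
    """Synchronous update: all rules read the state from the previous step."""
    new_state = dict(state)
    for node, rule in RULES.items():
        new_state[node] = bool(rule(state))
    return new_state

def async_step(state, order=None, rng=random):
    """Asynchronous update: nodes are updated sequentially, in random order
    unless an explicit order is given, each reading the current state."""
    if order is None:
        order = list(RULES)
        rng.shuffle(order)
    new_state = dict(state)
    for node in order:
        new_state[node] = bool(RULES[node](new_state))
    return new_state

if __name__ == "__main__":
    start = {"EGFR": True, "PI3K": False, "AKT": False, "ERK": False}
    print("synchronous :", sync_step(start))
    print("asynchronous:", async_step(start, order=["PI3K", "AKT", "ERK"]))
```

With the explicit order PI3K, AKT, ERK, a single asynchronous pass propagates the signal through the whole chain, while one synchronous step switches on only PI3K. Passing a seeded `random.Random` instance as `rng` makes random-order runs reproducible.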
Regardless of the updating scheme, it is frequently observed that logic-based models will settle into an “attractor state” in which states no longer change (logic steady state) or cycle in a pattern of activity [the oscillations in the example network are an example of a cyclic attractor state (Figure 1h)]. The continuous or mixed discrete-continuous methods mentioned previously formulate discrete logic as ordinary differential equations or piecewise linear equations, respectively. This treatment allows one to model both species’ state and time as continuous (Figure 2c) but at the cost of increased model complexity. Research into the influence of updating scheme on the segment polarity network of Drosophila melanogaster (42) and the mammalian cell cycle network (43) has demonstrated that different treatments of time can lead to distinct biological interpretations. Generally, the most appropriate updating scheme depends on the type of model built as well as the questions that the model is meant to address.

FIGURE 1: Examples of a logic-based network. (a) Protein signaling network. Biochemical species are represented as nodes. The interactions between these nodes are indicated with arrows. (b) Logic gate. Precisely how the nodes interact is specified with a simple Boolean logic gate. (c) Truth table specifying the output node given possible combinations of its input nodes’ values. (d) Boolean logic gates and their truth tables. If the gates are used in the example network, the interaction is shown on the right. We also describe the AND-NOT gate, which is used in the example network. We note that, in many applications of logic-based modeling, OR and AND gates are not explicitly indicated with their gate symbols. (e) Example of a logic-based network structure. The model was simulated with synchronous updating using custom MATLAB (MathWorks, Inc.) code (available as Supporting Information). (f) Network behavior with binary rules. Under initial conditions with different ligand stimulations, the network response was identical because the logic rules did not distinguish between EGF and HRG stimulation. (g) Multistate rule specification. The truth tables are given for each modeled species. These rules specify multiple states. The greater sensitivity of EGFR for EGF than HRG is encoded in the higher level it reaches upon stimulation by EGF. Rules that are different from the binary rules are highlighted. (h) Network behavior with the multistate rules given in panel g. The rules specified that EGFR is more sensitive to EGF than HRG. Thus, the behavior differed depending on the stimulation condition. Under EGF or EGF and HRG stimulation, the states of ERK and AKT were stabilized, whereas they oscillated under HRG stimulation alone. This is because the rules specified that, with the highest level of activation of EGFR (activation state two), the negative feedback by ERK did not effectively inhibit PI3K, whereas with medium-level activation of EGFR (activation state one, accessed when only HRG was present), the negative feedback was effective.

Another extension of logic-based modeling aims to incorporate probabilistic interactions (44, 45). This method allows one to account for uncertainty in the knowledge of signaling networks as well as stochasticity in biological systems. Also noteworthy are a number of efforts to apply related formalisms, such as Petri nets and cellular automata, to biological networks (46). In some cases, these formalisms can be reduced to logic-based formalisms, and they provide an additional level of abstraction that makes it possible to perform formal network analysis (47).
Because these probabilistic and computational techniques involve slightly different considerations compared to what was previously discussed, we do not describe them further and instead point the interested reader to the references listed above. This review focuses on a qualitative description of various logic-based formalisms. For readers interested in the actual computational procedures involved in using these methods, an outline is provided as Supplemental Figure 1 (Supporting Information). Additionally, several dedicated software packages have been developed for logic-based modeling of biochemical signaling networks with varying degrees of detail and differing updating schemes; some of these are listed in Table 2. We refer the interested reader to the references in this table for descriptions of each simulation procedure, in particular the quantitative approaches not described here.

FIGURE 2: Description of logic-based formalisms. (a) Description of various forms of logic-based models. All logic-based models describe species’ interactions in terms of logical statements (or rules). Discrete logic can specify two or more levels for each modeled species, whereas Boolean logic specifies only two levels of each species. In addition to these logic-based formalisms, various methods of describing discrete or Boolean logic models with piecewise continuous equations (37) or logic-based ODEs (28) have been successfully implemented to represent biochemical signaling networks. (b) Approximation of the input-output relationship between hypothetical biological species (black solid line) with binary (red solid line), ternary (green dashed line), and quaternary (blue dashed-dotted line) discrete logic gates as well as fuzzy logic or mixed discrete-continuous formalisms (orange dashed line). Various thresholds could be chosen for each discrete gate; chosen thresholds are purely hypothetical. (c) Plane of granularity in species’ states and treatment of time. Regions containing various logic-based modeling variants are denoted by shaded boxes. Boolean networks (blue region) are binary, but their treatment of time ranges from logic steady state to discrete with delays. Discrete models with multiple species states (orange region) cover a similar range of possible treatments of time. Fuzzy logic models (green region) describe a continuous range of species’ states with the same range of time granularity. Conversion of Boolean or discrete models into logic-based ODEs, piecewise linear models, and standardized qualitative dynamical systems (purple region) results in models that are continuous in both species’ states and time. Each case study is placed on the landscape according to how it represents the biological system of interest with a logic-based network.

CASE STUDIES OF APPLICATIONS OF LOGIC-BASED MODELS TO BIOCHEMICAL NETWORKS

Below we discuss six logic-based models of signal transduction as a means of highlighting different methods, biological questions, and opportunities for future development; we necessarily omit many details. Figure 3a shows a general workflow for applying logic-based modeling to signaling networks and serves as a means of summarizing the key features of each case study. (i) Case studies 1 and 2 involve models built solely from literature-based prior knowledge (Figure 3b). (ii) Case study 3 involves a comparison of models to data (Figure 3c). (iii) Case studies 4 and 5 use manual refinement to fit a fuzzy (case 4) or Boolean (case 5) logic-based model to experimental data (Figure 3d). (iv) Case study 6 presents a formal method for model optimization based on refining a literature-based Boolean model against high-throughput data (Figure 3e).

Case Study 1: Boolean Logic Model of Leukemic T Cell Large Granular Lymphocytes (29). Zhang et al.
use a Boolean network model constructed from the literature to ask which proteins in leukemic T cell large granular lymphocytes (T-LGL) should be inhibited to induce apoptosis. Simulation of a 58-node logic model of the T-LGL survival signaling network is used to address the following questions. (i) What are minimal stimulation conditions that recapitulate observed deregulation of the T-LGL network? (ii) What perturbations might reverse deregulation and promote apoptosis? A literature survey and experimental observations were combined to assemble a Boolean logic network describing signaling in T-LGL that affected cytoskeleton signaling, apoptosis, and proliferation. Simulations were compared when all nodes were free to vary and when some nodes were fixed (i.e., set to active or inactive and not allowed to change during the asynchronous updates). When the appropriate nodes were fixed, the model correctly recapitulated the situation in which leukemic T-LGL failed to undergo activation-induced cell death. Model analysis predicted a minimum set of stimuli that would result in the deregulated survival signaling previously observed in leukemic T-LGL. Experimental inhibition of this network state was shown to induce apoptosis in leukemic but not normal peripheral blood mononuclear cells (PBMC). Intriguingly, the authors identified nodes whose activation or inactivation caused the apoptosis node to be activated. These nodes are potential therapeutic targets for induction of apoptosis in leukemic T-LGL. 
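The idea of fixing nodes and asking which perturbations switch on an apoptosis node can be sketched in a few lines of Python. This is not the 58-node T-LGL model: the four-node chain, its rules, and the names `run` and `apoptosis_inducing_perturbations` are hypothetical and serve only to illustrate the procedure. Nodes are updated asynchronously in random order, and clamped nodes are held at their assigned value throughout.

```python
import random

# Hypothetical survival pathway: STIM -> KINASE -> INHIBITOR -| CASPASE -> APOPTOSIS.
# STIM is an external input and has no rule.
RULES = {
    "KINASE": lambda s: s["STIM"],
    "INHIBITOR": lambda s: s["KINASE"],
    "CASPASE": lambda s: not s["INHIBITOR"],
    "APOPTOSIS": lambda s: s["CASPASE"],
}

def run(clamped, rounds=20, seed=0):
    """Asynchronously update the free nodes; clamped nodes never change.

    The toy network has no feedback, so 20 rounds are more than enough
    to reach its steady state regardless of the update order.
    """
    rng = random.Random(seed)
    state = {"STIM": True, **dict.fromkeys(RULES, False)}
    state.update(clamped)
    free = [n for n in RULES if n not in clamped]
    for _ in range(rounds):
        rng.shuffle(free)  # new random update order each round
        for node in free:
            state[node] = bool(RULES[node](state))
    return state

def apoptosis_inducing_perturbations():
    """Clamp each upstream node on or off and keep the cases that
    end with the APOPTOSIS node active."""
    hits = []
    for node in ("STIM", "KINASE", "INHIBITOR", "CASPASE"):
        for value in (False, True):
            if run({node: value})["APOPTOSIS"]:
                hits.append((node, value))
    return hits

if __name__ == "__main__":
    print("unperturbed apoptosis:", run({})["APOPTOSIS"])
    print("perturbations inducing apoptosis:", apoptosis_inducing_perturbations())
```

In this toy setting the stimulated, unperturbed network keeps the apoptosis node off, and the scan returns the single-node interventions that turn it on. A larger model with feedback loops would require checking for cyclic attractors and repeating runs over many random update orders.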
Chemical knockdown

Table 1: Description of Logic Modeling Variants
logic modeling variant | time treatment | detail of species’ states | use in biological modeling
Boolean | discrete time steps | binary | (13, 15-17, 19, 24, 25, 33); (20, 42, 43, 51, 52)
discrete logic | discrete time steps | multistate; user-defined | (21, 31, 32, 38); (31, 53, 54)
fuzzy logic | discrete time steps or time can be treated as a variable | multistate; user-defined and implicit in calculation of output state | (14, 18, 30)
piecewise linear | continuous | multistate; user-defined and implicit in equations | (42, 55)
logic-based ODEs | continuous | multistate; implicit in ODE equations | 28
standardized qualitative dynamical systems | continuous | multistate; implicit in formalism | 23
Footnotes: Discrete time steps could use synchronous or asynchronous updating with or without delay or be examined at steady state. Biochemical signaling network. Genetic network.

Table 2: Tools Available for the Logic Modeling of Biochemical Signaling Networks
tool | type of logic | functionality | treatment of time | refs
BooleanNet | Boolean, piecewise linear | simulation and visualization | synchronous, asynchronous,